A senone based confidence measure for speech recognition

نویسندگان

  • Zachary Bergen
  • Wayne H. Ward
چکیده

This paper describes three experiments in using frame level observation probabilities as the basis for word confidence annotation in an HMM speech recognition system. One experiment is at the word level, one uses word classes, and the other uses phone classes. In each experiment we categorize hypotheses into correct and incorrect categories by aligning a best recognition hypothesis with the known transcript. The confidence of error prediction for each class is a measure of the resolvability between the correct and incorrect histograms.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Vocabulary-independent word confidence measure using subword features

This paper discusses how to compute word-level confidence measures based on sub-word features for large-vocabulary speaker-independent speech recognition. The performance of confidence measure using features at word, phone and senone level is experimentally studied. A framework of transformation function based system using sub-word features is proposed for high performance confidence estimation...

متن کامل

Learning small-size DNN with output-distribution-based criteria

Deep neural network (DNN) obtains significant accuracy improvements on many speech recognition tasks and its power comes from the deep and wide network structure with a very large number of parameters. It becomes challenging when we deploy DNN on devices which have limited computational and storage resources. The common practice is to train a DNN with a small number of hidden nodes and a small ...

متن کامل

Rapid adaptation for deep neural networks through multi-task learning

We propose a novel approach to addressing the adaptation effectiveness issue in parameter adaptation for deep neural network (DNN) based acoustic models for automatic speech recognition by adding one or more small auxiliary output layers modeling broad acoustic units, such as mono-phones or tied-state (often called senone) clusters. In scenarios with a limited amount of available adaptation dat...

متن کامل

Predicting unseen triphones with senones

In large-vocabulary speech recognition, the decoder often encounters triphones that are not covered in the training data. These unseen triphones are usually represented by corresponding diphones or context independent monophones. We propose to use decision-tree based senones to generate needed senonic baseforms for unseen triphones. A decision tree is built for each individual Markov state of e...

متن کامل

Efficient data selection for speech recognition based on prior confidence estimation using speech and context independent models

This paper proposes an efficient data selection technique to identify well recognized texts in massive volumes of speech data. Conventional confidence measure techniques can be used to obtain this accurate data, but they require speech recognition results to estimate confidence. Without a significant level of confidence, considerable computer resources are wasted since inaccurate recognition re...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1997